An automatic speech recognition system using neural networks and linear dynamic models to recover and model articulatory traces

Authors

  • Joe Frankel
  • Korin Richmond
  • Simon King
  • Paul Taylor
Abstract

We describe a speech recognition system that uses articulatory parameters as its basic features and phone-dependent linear dynamic models as its acoustic models. The system first estimates articulatory trajectories from the speech signal. Estimates of the x and y coordinates of 7 actual articulator positions in the midsagittal plane are produced every 2 milliseconds by a recurrent neural network trained on real articulatory data. The output of this network is then passed to a set of linear dynamic models, which perform phone recognition.
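As a rough illustration of the second stage, the sketch below scores a segment of articulatory frames against a set of phone-dependent linear dynamic models and picks the best-scoring phone. The abstract does not give the model equations, so everything here is an assumption: a standard linear-Gaussian state-space form (x_t = F x_{t-1} + w_t, y_t = H x_t + v_t), likelihoods computed with a Kalman filter, and the hypothetical names `ldm_log_likelihood` and `classify_segment`. In the system described, each frame y_t would be the 14-dimensional network output (x and y for 7 articulators).

```python
import numpy as np

def ldm_log_likelihood(Y, F, H, Q, R, x0, P0):
    """Log-likelihood of frames Y (T x d) under an assumed linear-Gaussian model:
         x_t = F x_{t-1} + w_t,  w_t ~ N(0, Q)
         y_t = H x_t     + v_t,  v_t ~ N(0, R)
       with initial state x_0 ~ N(x0, P0). Computed with a Kalman filter."""
    x, P = x0, P0
    total = 0.0
    for t, y in enumerate(Y):
        if t > 0:
            # Predict the state forward one frame.
            x = F @ x
            P = F @ P @ F.T + Q
        # Innovation (prediction error) and its covariance.
        e = y - H @ x
        S = H @ P @ H.T + R
        _, logdet = np.linalg.slogdet(S)
        total += -0.5 * (len(y) * np.log(2 * np.pi) + logdet
                         + e @ np.linalg.solve(S, e))
        # Correct the state estimate with the observed frame.
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ e
        P = P - K @ H @ P
    return total

def classify_segment(Y, models):
    """Return the phone whose model gives Y the highest log-likelihood.
       `models` maps a phone label to a dict of LDM parameters (hypothetical layout)."""
    scores = {phone: ldm_log_likelihood(Y, **params)
              for phone, params in models.items()}
    return max(scores, key=scores.get), scores
```

A full recognizer would also need to segment the utterance and estimate each phone's parameters from training data; this sketch covers only the scoring of a single, already-delimited segment.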

Related resources

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert spoken utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

ASR - articulatory speech recognition

The hidden Markov model (HMM) has proven to be the model which has made large-vocabulary automatic speech recognition (ASR) possible. The HMM is robust, versatile and has at its disposal a host of efficient algorithms which deal with training, speaker adaptation and recognition. However, there is nothing uniquely speech orientated about the HMM. In fact, certain assumptions are made of speech w...

Integration of multiple feature sets for reducing ambiguity in automatic speech recognition

This thesis presents a method to investigate the extent to which articulatory based acoustic features can be exploited to reduce ambiguity in automatic speech recognition search. The method proposed is based on a lattice re-scoring paradigm implemented to integrate articulatory based features into automatic speech recognition systems. Time delay neural networks are trained as feature detectors ...

Automatic speech recognition using dynamic Bayesian networks with both acoustic and articulatory variables

Current technology for automatic speech recognition (ASR) uses hidden Markov models (HMMs) that recognize spoken speech using the acoustic signal. However, no use is made of the causes of the acoustic signal: the articulators. We present here a dynamic Bayesian network (DBN) model that utilizes an additional variable for representing the state of the articulators. A particular strength of the s...

Convolutional neural network with adaptive windows for speech recognition

Although speech recognition systems are widely used and their accuracy continues to improve, there is a considerable gap between their performance and human recognition ability. This is partly due to high speaker variability in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

Publication date: 2000